Using cross-language cues for story-specific language modeling

Authors

  • Sanjeev Khudanpur
  • Woosung Kim
Abstract

We propose methods to exploit contemporary news articles in a resource-rich language, together with cross-language information retrieval and machine translation, to sharpen language models for a news story in a language with fewer linguistic resources. We report experimental results on story-specific Chinese language models that use cues from a parallel corpus of English news stories. We demonstrate that even with fairly crude cross-language information retrieval, level-1 machine translation and simple linear interpolation, a significant (18%) reduction in perplexity may be obtained over a Chinese trigram model. We also demonstrate that this method of sharpening the Chinese language model is complementary to other techniques like topic-dependent modeling, and that the two in combination result in an even greater reduction in perplexity (28%).
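
The linear interpolation and perplexity measurement mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the vocabulary, the romanized tokens, the add-alpha smoothing, the weight `lam`, and all function names are assumptions made for illustration, and a uniform unigram model stands in for the Chinese trigram baseline to keep the example short.

```python
import math
from collections import Counter


def unigram_from_tokens(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(t for t in tokens if t in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def interpolate(p_story, p_base, lam):
    """Linear interpolation: lam * story-specific + (1 - lam) * baseline."""
    return {w: lam * p_story[w] + (1.0 - lam) * p_base[w] for w in p_base}


def perplexity(model, tokens):
    """Perplexity of a unigram model on a token sequence."""
    log_sum = sum(math.log(model[t]) for t in tokens)
    return math.exp(-log_sum / len(tokens))


# Hypothetical vocabulary of romanized Chinese words (illustrative only).
vocab = {"gupiao", "shichang", "tianqi", "bisai"}

# Stand-in baseline: uniform over the vocabulary (the paper's baseline is a trigram).
base = {w: 1.0 / len(vocab) for w in vocab}

# Hypothetical cues: words obtained by translating retrieved English articles.
translated_cues = ["shichang", "gupiao", "gupiao", "shichang", "gupiao"]
story = unigram_from_tokens(translated_cues, vocab)

# Story-specific model: mix the cue distribution into the baseline.
mixed = interpolate(story, base, lam=0.5)

# Toy test story that shares its topic with the retrieved articles.
test_story = ["gupiao", "shichang", "gupiao"]
ppl_base = perplexity(base, test_story)
ppl_mixed = perplexity(mixed, test_story)
print(f"baseline perplexity: {ppl_base:.3f}")
print(f"interpolated perplexity: {ppl_mixed:.3f}")
```

In this toy setting the interpolated model assigns more probability to words that the translated cues make likely, so its perplexity on the on-topic test story is lower than that of the baseline. In practice the interpolation weight would be tuned on held-out data.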

Similar resources

Using Cross-language Cues for Story-sp

We propose methods to exploit contemporary news articles in a resource-rich language, together with cross-language information retrieval and machine translation, to sharpen language models for a news story in a language with fewer linguistic resources. We report experimental results on story-specific Chinese language models that use cues from a parallel corpus of English news stories. We demonst...

Contemporaneous text as side-information in statistical language modeling

We propose new methods to exploit contemporaneous text, such as on-line news articles, to improve language models for automatic speech recognition and other natural language processing applications. In particular, we investigate the use of text from a resource-rich language to sharpen language models for processing a news story or article in a language with scarce linguistic resources. We demon...

Architecture Narration: A Comparative Study on Narration in Architecture and Story

The way architects think about different issues from developing plans, perspectives, and views to cross-sections and structure of a building is a common and general one. Regardless of its merits and efficiency, this way of thinking indicates a degradation in architectural thinking. Indeed, architectures today are caught in a specific architecture language where the boundaries of language create...

Cross-language similarities and differences in the uptake of place information.

Cross-language differences in the use of coarticulatory cues for the identification of fricatives have been demonstrated in a phoneme detection task: Listeners with perceptually similar fricative pairs in their native phoneme inventories (English, Polish, Spanish) relied more on cues from vowels than listeners with perceptually more distinct fricative contrasts (Dutch and German). The present g...

Cross-linguistic Validation of Processability Theory: The Case of EFL Iranian Students’ Speaking Skill

This study investigated the validity of processability theory proposed by Pienemann (1998/2015) among Iranian EFL learners’ oral performance. The theory defines six procedural stages for learners in the process of second language acquisition. In order to conduct the study, 10 intermediate EFL learners were selected based on their performance on Oxford Placement Test. Then, they partici...

Publication date: 2002